Recent research on complex systems has highlighted the so-called super-linear
growth phenomenon. As the system size P (measured as the population of a city
or the number of active users in an online community) increases, the total
activity X generated by these people (measured as GDP, the number of new
patents, or the number of crimes in a city) also increases, but at a faster
rate. This accelerating growth can be well described by a super-linear power
law $X \propto P^{\gamma}$ ($\gamma > 1$). However, an explanation of this
phenomenon is still lacking. In this paper, we propose a modeling framework
called growing random geometric models to explain the super-linear
relationship. A growing network is constructed on an abstract geometric space.
A newly arriving node can survive only if it lands at an appropriate place in
the space, close to existing nodes; new edges then connect it to its adjacent
nodes, whose number is determined by the density of existing nodes. Thus the
total number of edges can grow faster than the number of nodes, following the
super-linear power law. The models not only reproduce many phenomena observed
in complex networks, e.g., scale-free degree distributions and an
asymptotically size-invariant clustering coefficient, but also resemble known
patterns of cities, such as fractal growth and the area-population and
diversity-population scaling relations. Strikingly, only one parameter, the
dimension of the geometric space, really influences the super-linear growth
exponent $\gamma$.
I. INTRODUCTION

The super-linear phenomenon is described as a scaling relation,

$$X = c P^{\gamma}. \tag{1}$$



X and P have different representations in different systems. In urban
systems, for example, X represents GDP, R&D investment, crimes or the number
of new patents, and P represents the population [1-4]. In online communities,
X is the total number of activities (tags, blogs) generated by the users, and
P is the total number of active users (those who generate at least one
activity) [5]. In language, X is the total number of words in an article and
P is the number of distinct words in the same article [6, 7]. In equation (1),
$\gamma$ is an exponent describing the relative growth speed of X with respect
to P. A large number of empirical studies report that $\gamma$ falls into the
interval [0, 2). For example, [5] reported values of $\gamma$ from 1.17 to
1.48 for different online communities, and [1] found values from 1.15 to 1.26
for cities in different countries. However, the exponent for the relationship
between population and GDP can approach 1 if the scale of the system is large
and interactions among people are weak. For example, [?] found that the
exponent is almost 1 for countries. According to our unpublished results, the
scaling relationship is almost linear for provinces and states. So far, we
know that equation (1) holds for a large number of different systems, but the
exponent differs from system to system. The next question is: what is the
underlying mechanism of this remarkable phenomenon?
There are already some studies that try to explain the super-linear growth
phenomenon. For instance, Arbesman et al. attributed it to the properties of
the interaction network [8], but their model makes several assumptions about
the network that are hard to match to real systems. Meanwhile, [5, 7] tried to
link universal patterns in distributions (e.g., the DGBD distribution in [5]
and Zipf's law in [7]) to the super-linear growth pattern using a large amount
of empirical data. Although a strong connection between size-dependent
distributions and super-linear growth has been revealed [5], the underlying
mechanisms are still unknown, since the size-dependent distribution and
super-linear growth are actually two different expressions of the same
law [5].

In the network community, researchers have found that many empirical networks
exhibit a so-called accelerating growth phenomenon [9, 10], which also states
a power law relationship between the number of edges (X) and the number of
nodes (P), but they did not try to explain this fact. Leskovec et al. [11]
rediscovered the accelerating growth pattern and renamed it the densification
phenomenon. They built a forest fire model to understand its origin [12].
Because of the complexity of this model, they later developed a totally new
one called the Kronecker graph model. As claimed in [13], the densification
phenomenon is a mathematical property of Kronecker products. Although the
model succeeds in fitting many empirical network data sets, explanations and
real-life grounding are still lacking. Also, in the Kronecker graph model, the
intercept of the power law relation between the number of nodes and edges,
i.e., c in equation (1), must be 1. This strong assumption is hardly supported
by empirical data. However, these studies make it clear that the super-linear
growth pattern, which exists widely in various systems, can be discussed
against a network background. Recently, by analyzing data from cell-phone
communication networks in different cities, Schläpfer et al. [14] found that
the accelerating growth exponent $\gamma$ has the same value as the
super-linear growth exponent of cities and that the clustering coefficients of
these networks are size invariant. This coefficient almost determines the
super-linear growth exponent [14]. Therefore, as the size of the network
increases, the clustering coefficient must remain unchanged so that the
accelerating growth or densification pattern can emerge as a systemic result.
However, their model cannot explain the origin of the size-invariant
clustering coefficient, so the super-linear growth puzzle remains unsolved.
More recently, Bettencourt developed a network model to explain the origin of
super-linear growth in urban systems [15]. Although this model fits the
empirical data of cities very well, it is complicated and depends on a set of
assumptions that are hard to test.

Although several models have been presented to explain the super-linear
scaling law, we still lack a simple model with a minimal number of parameters
that reproduces as many of the patterns observed in empirical systems as
possible. In this paper, we propose a new modeling framework for growing
networks in geometric space, called growing random geometric models, to
explain the super-linear phenomenon. It uses a very simple mechanism to
reproduce many patterns observed in cities and networks. Strikingly, we found
that the super-linear exponent is determined by only one parameter, d, the
dimension of the geometric space.



II. BASIC MODEL 

Inspired by the niche model in food web studies [16], we construct a spatial
growing network in an abstract geometric space. If a newly arriving node lands
at a place that matches existing nodes, it survives and new links are built
accordingly.

This basic idea is very similar to the well-developed models of random
geometric graphs [17] and disk percolation [18]; the main difference is the
growing mechanism in our model. Unlike some well-known growing network
models [19], the number of newly added edges is not given in advance but
determined by the existing nodes. We introduce one of the simplest models of
this framework in this section and leave more interesting extensions to the
following sections.

The basic model contains the following elements: a geometric space
$\mathcal{Y}$, modeled as a d-dimensional Euclidean space in which the
coordinates can be any real numbers, that is, $\mathcal{Y} = \mathbb{R}^d$,
where $\mathbb{R}$ is the set of real numbers. A relation R serving as the
matching rule is defined on $\mathcal{Y}$,
$R \subseteq \mathcal{Y} \times \mathcal{Y}$. In the basic model, we use the
simplest matching rule: the Euclidean distance between two points must be
smaller than a given parameter r, that is:

$$R = \{(x, y) \in \mathcal{Y} \times \mathcal{Y} \mid \|x - y\| < r\}. \tag{2}$$

FIG. 1. A 2-d Geometric Space of the Basic Model. Black disks are existing
agents; the red disk is a newly arriving agent that will survive, while the
gray one is an agent that cannot survive. The dark lines are links between
agents; the dashed lines are the links added between the red agent and
existing agents.

The simplest initial condition is that the geometric space contains only a
single agent located at the origin of $\mathcal{Y}$. We can of course design
more complicated initial conditions in the extended models.

The growing process of this model is as follows.

Step 1: In each time step, one new agent i is added to the system with a
randomly assigned coordinate $x_i \in \mathcal{Y}$. If the coordinates of some
existing agents match the new one, then i survives; otherwise it dies
immediately. We denote the set of existing agents that match agent i as
$M_i = \{j \mid \|x_i - x_j\| < r\}$; the new agent i survives and stays in
the system forever (keeping its coordinate fixed) only if
$M_i \neq \emptyset$.

Step 2: If the new agent survives, new links are added from the agents in
$M_i$ to the new agent i (as shown in figure 1).

We then repeat these two steps to obtain a growing network. By studying the
relationship between the total number of edges X and the number of nodes P in
the network, we can test the super-linear growth law.
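
The two steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names `grow_basic_model` and `fit_power_law` are hypothetical, and candidate positions are drawn from the bounding box of the existing agents padded by r on every side. The padding is an assumption made here so that sampling stays finite; a point outside that padded box cannot match any agent.

```python
import numpy as np


def grow_basic_model(steps, d=1, r=1.0, seed=0):
    """Run `steps` trials of the basic model; return positions and (P, X) history."""
    rng = np.random.default_rng(seed)
    pos = np.zeros((steps + 1, d))  # row 0 is the initial agent at the origin
    n_nodes, n_edges = 1, 0
    history = [(n_nodes, n_edges)]
    for _ in range(steps):
        alive = pos[:n_nodes]
        # Assumption: sample uniformly from the bounding box padded by r.
        low = alive.min(axis=0) - r
        high = alive.max(axis=0) + r
        x = rng.uniform(low, high)
        # Step 1: the matched set M_i holds all agents closer than r.
        matched = np.count_nonzero(np.linalg.norm(alive - x, axis=1) < r)
        if matched == 0:
            continue  # the candidate dies immediately
        # Step 2: the survivor links to every agent in M_i.
        pos[n_nodes] = x
        n_nodes += 1
        n_edges += matched
        history.append((n_nodes, n_edges))
    return pos[:n_nodes], np.array(history)


def fit_power_law(p, x):
    """Least-squares fit of log X = log c + gamma * log P; returns (gamma, c)."""
    gamma, log_c = np.polyfit(np.log(p), np.log(x), 1)
    return gamma, np.exp(log_c)


if __name__ == "__main__":
    _, hist = grow_basic_model(steps=5000, d=1, r=1.0)
    nodes, edges = hist[1:, 0], hist[1:, 1]  # skip the first row, where X = 0
    print(fit_power_law(nodes, edges))
```

The fit is an ordinary log-log regression, so the estimated exponent is sensitive to the early, small-P part of the history; restricting the fit to larger P may be preferable in practice.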

Although the growing process is very simple and seems homogeneous, the
resulting network is very uneven in both time and space. First, the new agent
is placed at a random position in the geometric space, but the position is
occupied only if it is surrounded by existing agents. Thus the density of
existing agents is uneven in the geometric space, and the network itself can
be regarded as the result of a "crystallization". Second, the growing process
is uneven in time, because growth is expected to be much slower in the initial
stage than in later steps.

However, in the simulation, we have to use a trick to avoid the problem of
random searching in an infinite space: the coordinate $x_i$ of a newly
arriving agent is not randomly assigned in the whole geometric space
$\mathcal{Y}$ but in a much smaller subset
$\mathcal{S} = \{y \mid y \in [\eta x_m, \eta x_M]\}$, where $\eta = 5$ in the
simulations and $x_m$, $x_M$ are the minimum and maximum coordinates along all
dimensions. In a word, the newly arriving agent is drawn at random from a
d-dimensional box covering all existing agents. This trick accelerates our
program dramatically but has no effect on the final results.



A. One Dimensional Model

Let us consider the simplest case of our basic model, in which the geometric
space is a one-dimensional line, i.e., d = 1. Even in this simplest case, the
super-linear growth phenomenon can be generated.

Figure 2 shows three simulations with different r. First, all the simulations
show super-linear growth, that is, the number of edges versus the number of
nodes at different time steps follows a power law with an exponent larger than
one. Second, all the curves of X versus P almost overlap on the plot, which
means that the fits by equation (1) have nearly the same parameters. So the
exponents in equation (1) are independent of the parameter r. This point is
confirmed by the larger-scale simulations shown in figure 3.

We clearly observe that both $\gamma$ and c fluctuate around their mean values
for different r. Therefore, the super-linear growth phenomenon does not depend
on the parameter r.

FIG. 2. Number of Edges vs. Number of Nodes in 1-d Basic Model with
Different r

FIG. 3. Parameters $\gamma$, c change with r; all simulations are repeated 10
times with 10000 time steps.



B. Two Dimensional Model

Besides the basic phenomena shown in the 1-d model, the 2-d model shows more
interesting patterns. In this case, the geometric space itself can be
illustrated by a 2-d picture, and the network formed by the model is a spatial
network, so we can show the network at different time steps.

From figure 4 we see that the growing network in the geometric space is very
uneven. The density of agents in the center of the geometric space is much
higher than in peripheral places. In fact, the network in the 2-d geometric
space is a fractal: the number of occupied lattice cells scales with the
measurement size as a power law with exponent (fractal dimension) $\alpha$.
This can be confirmed by the box-counting method, with the fractal dimension
calculated at different time steps.

According to the box-counting method, the asymptotic fractal dimension is
about 1.83. All the dimensions $\alpha$ obtained during the simulation lie
between 1 and 2; therefore, the spatial networks are fractals.
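
The box-counting estimate mentioned above can be sketched as follows. This is only an illustration, assuming the agents' coordinates are available as an (N, d) array; the name `box_counting_dimension` and the choice of box sizes are hypothetical, and the dimension is taken as the slope of log N(s) against log(1/s), where N(s) is the number of occupied boxes of side s.

```python
import numpy as np


def box_counting_dimension(points, box_sizes):
    """Estimate the fractal dimension alpha from N(s) ~ s^(-alpha)."""
    points = np.asarray(points, dtype=float)
    origin = points.min(axis=0)
    counts = []
    for s in box_sizes:
        # Map each point to the integer index of the box that contains it.
        cells = np.floor((points - origin) / s).astype(np.int64)
        counts.append(len(np.unique(cells, axis=0)))
    # The slope of log N(s) versus log(1/s) is the box-counting dimension.
    inv_sizes = 1.0 / np.asarray(box_sizes, dtype=float)
    slope, _ = np.polyfit(np.log(inv_sizes), np.log(counts), 1)
    return slope
```

As a sanity check, a completely filled square lattice should give a dimension of 2 and points along a straight line a dimension of 1; a fractal point set falls in between.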



C. Three Dimensional Model

So far, all the geometric spaces we have discussed are very abstract. In this
subsection we discuss a more concrete model: the interaction network of a
city. Each node of the network is an individual living in the city, and the
links between nodes stand for interactions (e.g., phone or friendship
connections). The geometric space is a 3-dimensional Euclidean space in which
two dimensions stand for the geographic space (since a city lies on a
2-dimensional plane) and the remaining dimension is the similarity space. The
basic matching rules are the same as in the previous model settings. Hence, a
connection is built only if two individuals are located very close to each
other in the geographic space and have common interests (similarities).

FIG. 4. Network formation in 2-d basic model (r = 10 ) at different time
steps.

FIG. 5. Box counting calculation of fractal dimension $\alpha$ of the growing
network at different time steps (r = 10 5 ).

In this model, all the phenomena discussed in the previous sections can also
be observed. For example, the super-linear exponent is about 1.218, and the
fractal dimension of the network in 3-d space is about 2.36. The projection of
the network onto the geographic space (the 2-d world) is also a fractal, with
a dimension of around 1.77, which is smaller than that of the 2-d model. Thus
the 2-d projection of the 3-d network is less complex than the 2-d model,
because the newly introduced dimension makes the matching criterion stricter.

Besides the fractal dimension of the network, we can also study the
relationship between area and population, which is comparable to empirical
studies of cities [?]. In our model, we calculate a city's area as follows. On
the 2-d geographic space, we select a specific resolution as our observational
scale. We then use this resolution to rasterize the whole geographic space and
count the number of occupied boxes as the area, just as in the box-counting
method for the fractal dimension. Because a box occupied by multiple agents is
treated as one unit of area, the area grows much more slowly than the
population. Therefore, a sub-linear area-population relationship is obtained,
as shown in figure 6.

The area and population follow a sub-linear power law relation:

$$A \propto P^{\beta}, \tag{3}$$

where A is the area of the city and $\beta$ is the exponent. As shown in
figure 6, all values of $\beta$ are smaller than 1 and fall into the interval
[0.38, 0.97]. This result is consistent with the observed exponents
[0.33, 0.91] of real cities [20, 21]. We also show how the area-population
relation depends on resolution. As the size of the box increases, $\beta$ also
increases. Because a city is a fractal object, the area as a macroscopic
measurement necessarily depends on the measurement scale.
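
The area measurement described above can be sketched as follows. This is a hedged illustration rather than the authors' procedure: the names `occupied_area` and `area_population_exponent` are hypothetical, the 2-d coordinates are assumed to be stored in order of arrival, and $\beta$ is estimated by a log-log regression over growing prefixes of the population.

```python
import numpy as np


def occupied_area(points_2d, resolution):
    """Number of grid boxes of side `resolution` holding at least one agent."""
    pts = np.asarray(points_2d, dtype=float)
    cells = np.floor(pts / resolution).astype(np.int64)
    return len(np.unique(cells, axis=0))


def area_population_exponent(points_2d, resolution, n_samples=20):
    """Fit A ~ P^beta over growing prefixes of the agents (arrival order).

    Assumes at least 10 agents, so that the smallest prefix is meaningful.
    """
    pts = np.asarray(points_2d, dtype=float)
    sizes = np.unique(np.geomspace(10, len(pts), n_samples).astype(int))
    areas = [occupied_area(pts[:p], resolution) for p in sizes]
    beta, _ = np.polyfit(np.log(sizes), np.log(areas), 1)
    return beta
```

Calling `area_population_exponent` with several values of `resolution` gives one way to examine how the fitted exponent changes with the box size.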

We can use a similar method to study the similarity space, and we found a
similar sub-linear law between diversity (the number of different types of
features) and population,

$$D \propto P^{\eta}. \tag{4}$$

In figure 7, all the exponents $\eta$ are around 0.3, which is much smaller
than $\beta$ and more stable with respect to different resolutions.

Beyond the spatial properties and super-linear growth, we can also discuss
other network features and how they change with the size of the system.

The degree distributions are not power laws but Weibull distributions. This is
inconsistent with the empirical observation that degree distributions are
heavy-tailed. However, the clustering coefficient is asymptotically unchanged
as the size of the network grows, which is also observed in empirical
data [14].
FIG. 6. Area-population relationship at different resolutions. Here,
resolution is the size of the box used to cover the set of nodes in the 2-d
space, and area is the number of boxes counted at the given resolution.

Interestingly, through simulations in 1, 2 and 3 dimensions, we found that the
super-linear growth exponent depends not on r but on the spatial dimension d
(as shown in figure 10). To see how the super-linear growth exponent decays
with the spatial dimension, we ran more experiments, as shown in figure 11.



III. MODEL EXTENSIONS 

We have seen the diverse and interesting patterns that the basic model can
exhibit. We can, however, add a little complexity to the basic model to bring
it closer to reality. We mainly consider the following extensions. First, we
study geometric spaces with limitations. Second, we add more heterogeneity to
the model.



A. Finite Geometric Space 

We first extend our model to a finite geometric space. The simplest finite
space we can imagine is the unit interval [0, 1] on the real line. In this
case, the super-linear exponent depends on the interaction radius r, because
the space is no longer scale-free but has a maximum characteristic scale,
which is the upper bound of the radius. Accordingly, figure 12 shows different
straight lines for different r.

FIG. 7. Diversity-population relationship at different area resolutions
(horizontal axis: Population).

As we observed, the slope of the straight line, i.e., the super-linear
exponent, increases with the interaction radius r. When the radius is so large
that it is comparable with the maximum range of the geometric space itself,
the power law exponent $\gamma$ approaches the maximum possible value 2.

The unevenness in time, i.e., the waiting time between two successive agents
being added to the network, can be investigated in this extended model. In
Section II we mentioned that a trick has to be used to accelerate the
simulation, because otherwise one would wait an infinite time to add a new
node when the geometric space is infinite. In this extension, however, we can
directly simulate the whole random searching process without this trick. In
every time step, a new agent with a random position in the interval is added,
and it survives on the condition that some old agents are close to it.
Therefore, as time goes by, the growth of the whole network accelerates, since
the number of existing nodes becomes larger and larger. Instead of plotting
the waiting time between any two surviving agents, we study the cumulative
time, i.e., the total time t elapsed so far versus the total number of agents
that have survived before t. We found an asymptotic power law between these
two variables:

$$t \propto P^{\chi}. \tag{5}$$

FIG. 8. Degree Distributions with Time. We use Weibull dis-
(Horizontal axis: Degree. Legend, Weibull fit parameters: T = 2000:
$\lambda$ = 12.6465, k = 2.1644; T = 4000: $\lambda$ = 14.8663, k = 2.0673;
T = 6000: $\lambda$ = 15.9782, k = 1.9904; T = 8000: $\lambda$ = 16.8666,
k = 1.9589; T = 10000: $\lambda$ = 17.4927, k = 1.9165.)



From figure 13 we see that the time needed to add agents to the network scales
with the size of the network. The exponent $\chi$ increases with the
interaction radius r, from about 0.52 for r = 0.00056 to about 0.98 for
r = 0.1. When the radius r is comparable with the scale of the geometric
space, the exponent $\chi$ approaches 1. Therefore, the growing process is
actually a fractional dynamical process.
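
The cumulative-time measurement on the unit interval can be sketched as follows. This is a minimal illustration under assumptions that the text does not fix: the function name `grow_on_unit_interval` is hypothetical, the initial agent is placed at 0.5, and every trial (surviving or not) is counted as one time step.

```python
import numpy as np


def grow_on_unit_interval(trials, r, seed=0):
    """Return (P, t): population after each survival and the trial index t."""
    rng = np.random.default_rng(seed)
    pos = np.empty(trials + 1)
    pos[0] = 0.5  # assumption: the initial agent sits in the middle of [0, 1]
    n = 1
    sizes, times = [], []
    for t in range(1, trials + 1):
        x = rng.random()  # every trial costs one time step, survivor or not
        if np.any(np.abs(pos[:n] - x) < r):
            pos[n] = x
            n += 1
            sizes.append(n)
            times.append(t)
    return np.array(sizes), np.array(times)


if __name__ == "__main__":
    sizes, times = grow_on_unit_interval(trials=20000, r=0.01)
    # Slope of log t versus log P gives an estimate of the exponent chi.
    chi, _ = np.polyfit(np.log(sizes), np.log(times), 1)
    print(chi)
```

Because no bounding-box trick is used here, failed trials are part of the elapsed time, which is what makes the relation between t and P informative.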



B. Finite Resolution 

In the previous subsection, we considered the upper bound of the size of the
geometric space; the lower bound is considered in this subsection.

First, we model the whole geometric space as a discrete cellular space in
which each agent occupies exactly one cell. A new node can therefore survive
only if (1) it can build a link with at least one existing agent and (2) its
position in the geometric space is not occupied by any existing agent. With
this new rule, we find that the super-linear exponent depends on the
interaction radius r.

FIG. 9. Clustering coefficient change with size P.

FIG. 10. Dependence of $\gamma$ on r in different dimensions.

We set the minimum resolution to 1 and the maximum
range of the geometric space to $10^5$ in all the following
simulations. The interaction radius r changes from 1 
to 10 , the dependence of super-linear exponents on the 
radius r in $d = 1, 2, 3$ dimensional spaces is shown in figure 14.

We see that in all cases the exponents are close to 1 when r = 1, which means
that the networks are very regular and lattice-like. As the interaction radius
r increases, this constraint becomes weaker, so the exponents increase. When r
is in the intermediate range, the exponents $\gamma$ are independent of r,
because neither the upper nor the lower limit constrains the systemic
processes. These stationary exponents are almost identical to those in the
unbounded geometric space. When r is big enough, the upper bound of the
geometric space influences the behavior of $\gamma$, so the exponents increase
with the interaction radius r until they reach the maximum value 2. In these
extensions, the parameter r thus affects the super-linear exponent $\gamma$
through the space limitation effect.

FIG. 11. Decay of $\gamma$ with the spatial dimension d.

FIG. 12. $X \propto P^{\gamma}$ in Unit Interval



FIG. 13. The scaling relation between population P and simulation time t,
$t \propto P^{\chi}$, for different r. Horizontal axis: P (Number of Nodes).
Legend: r = 0.00056234, $\chi$ = 0.52434; r = 0.0031623, $\chi$ = 0.55813;
r = 0.017783, $\chi$ = 0.62569; r = 0.1, $\chi$ = 0.97951.
FIG. 14. The dependence of the super-linear exponent on the interaction radius
r in d = 1, 2, 3 dimensional spaces. All simulations are run 20 times with
$10^4$ cycles.



C. Heterogeneous Models

We found that the degree distributions of the basic model are not power laws,
unlike those of many empirical networks. The essential reason is the
homogeneity of the basic model, i.e., all interaction radii are the same. This
strong assumption is not supported by real life. Thus, in this subsection, we
consider a heterogeneous model with random interaction radii.

In a first attempt, we suppose that the r of each agent is a random number
following an exponential distribution, so that the complementary cumulative
distribution function of this variable is



$$\Pr\{r > x > 0\} = \int_x^{\infty} \lambda \exp(-\lambda y)\,dy. \tag{6}$$



In this way, we can generate both super-linear growth and scale-free degree
distributions. That is, the resulting degree distribution has a power law
tail,

$$\Pr\{k > x > m\} = \int_x^{\infty} (\rho - 1)\, m^{\rho - 1} y^{-\rho}\,dy, \tag{7}$$



where k is the random variable for degrees, m is the lower bound of the degree
for the power law tail, and $\rho$ is its exponent. As $\rho$ decreases, the
heterogeneity of the degrees becomes larger. Figure 15 shows the cumulative
degree distributions of several networks with different $\lambda$ values.
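
As a short worked step, assuming the standard exponential and power law (Pareto) densities written in equations (6) and (7), both tail integrals have closed forms:

```latex
\begin{align}
\Pr\{r > x\} &= \int_x^{\infty} \lambda e^{-\lambda y}\,dy
              = e^{-\lambda x}, \qquad x > 0, \\
\Pr\{k > x\} &= \int_x^{\infty} (\rho - 1)\, m^{\rho - 1} y^{-\rho}\,dy
              = \left(\frac{m}{x}\right)^{\rho - 1}, \qquad x > m,\ \rho > 1.
\end{align}
```

The second line implies that, on a log-log plot, the cumulative degree distribution of the tail is a straight line with slope $-(\rho - 1)$, which is one way the exponent $\rho$ can be read off a cumulative plot.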

The exponential distribution of r is not the only choice; other probability
density functions also reproduce the power law degree distribution and the
super-linear growth pattern. For example, we replace formula (6) with

$$\Pr\{r > x > 0\} = \int_x^{\infty} \frac{\sqrt{2}}{\sigma\sqrt{\pi}} \exp\left(-\frac{y^2}{2\sigma^2}\right) dy. \tag{8}$$



That means r follows the half-normal distribution (in each time step, we draw
a random number from a normal distribution and take its absolute value).
Figure 16 shows the degree distributions.

Comparing figures 15 and 16, we see that the exponent of the degree
distribution in the normal distribution model is larger than that in the
exponential distribution model. That means the heavy tail is less pronounced
than in the exponential case.

To see how the exponents $\gamma$ and $\rho$ change with the parameters
$\lambda$ and $\sigma$, we conducted larger-scale experiments. The results are
shown in figure 17. Both exponents are almost invariant when $\lambda$ and
$\sigma$ change. And because all the experiments are done in 2-d space, the
exponent $\gamma$ is almost identical to its value in the basic model.
Therefore, although we introduce two new parameters $\lambda$ and $\sigma$,
the super-linear exponent is independent of them.
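
The heterogeneous variant can be sketched as follows. Several details are assumptions made here for illustration, since the text does not fix them: the name `grow_heterogeneous` is hypothetical, each candidate draws its own radius and links to the existing agents within that radius, and candidates are sampled from the bounding box of existing agents padded by the drawn radius.

```python
import numpy as np


def grow_heterogeneous(steps, d=2, radius="exponential", lam=1.0, sigma=1.0, seed=0):
    """Basic model with a fresh random radius per candidate; returns node degrees."""
    rng = np.random.default_rng(seed)
    pos = np.zeros((steps + 1, d))  # row 0 is the initial agent at the origin
    degree = np.zeros(steps + 1, dtype=np.int64)
    n = 1
    for _ in range(steps):
        if radius == "exponential":
            r = rng.exponential(1.0 / lam)   # density lam * exp(-lam * y)
        else:
            r = abs(rng.normal(0.0, sigma))  # half-normal: |N(0, sigma^2)|
        alive = pos[:n]
        # Assumption: sample from the bounding box padded by this candidate's r.
        x = rng.uniform(alive.min(axis=0) - r, alive.max(axis=0) + r)
        matched = np.flatnonzero(np.linalg.norm(alive - x, axis=1) < r)
        if matched.size == 0:
            continue  # no agent within the candidate's radius: it dies
        pos[n] = x
        degree[matched] += 1      # each matched agent gains one link
        degree[n] = matched.size  # the survivor links to all matched agents
        n += 1
    return degree[:n]
```

Under a different convention (for example, matching on the existing agent's radius instead of the candidate's), the degree distribution could differ, so the sketch should be read as one possible interpretation.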




FIG. 15. Degree distributions of the heterogeneous model in 2-d simulations
with r exponentially distributed, for different $\lambda$. The values of
$\rho$ are the power law exponents of the degree distributions.

FIG. 16. Degree distributions of the heterogeneous model in 2-d simulations
with r half-normally distributed, for different $\sigma$. The values of $\rho$
are the power law exponents of the degree distributions.

FIG. 17. How the exponents $\gamma$ (allometry) and $\rho$ (scale-free) change
with the parameters $\lambda$ (exponential distribution) and $\sigma$ (normal
distribution) in 2-d space.



IV. DISCUSSION

In this paper, we introduce a new growing network model called the growing
random geometric graph. It is in fact a modeling framework that can be used to
model various complex networks and other systems. One of the main advantages
of these models is that they all exhibit super-linear growth, also known as
densification or accelerating growth.

Besides super-linear growth, this simple model also shows many other scaling
behaviors. We used a set of exponents to characterize these scalings: $\alpha$
is the fractal dimension of the spatial network in the space, $\beta$ is the
exponent relating area and population, $\eta$ characterizes the scaling
between the diversity of similarities and population, $\chi$ describes the
power law relation between time and the size of the system, and $\rho$ is the
power law exponent of the degree distribution in the extended model. All these
scaling behaviors indicate that the growing random geometric graph is an
anomalous object governed by some unknown fractional dynamics. Further
studies, especially mathematical analysis, are warranted.

Although we have discussed several interesting extensions of the original
model, more extensions are needed. For example, we can grow the network not
only in Euclidean space but also in other interesting spaces, e.g., hyperbolic
space [22]. Other matching rules can also be considered. More interesting
phenomena may then emerge.

Finally, this is only the first step for this model; both theoretical analysis
and empirical tests are needed in future studies.